Complete inverted files for efficient text retrieval and analysis
نویسندگان
چکیده
منابع مشابه
Using Inverted Files to Compress Text
This is the first report on a new approach to text compression. It consists of representing the text file with compressed inverted file index in conjunction with very compact lexicon, where lexicon includes every word in the text. The index is compressed using standard index compression techniques, and lexicon is compressed by original dictionary compression method that gives better compression...
متن کاملOptimistic Concurrency Control for Inverted Files in Text Databases
Inverted files are frequently used as index data structures for very large text databases. Most applications of this data structure are for read-only query operations. However, the problem of introducing update operations has deserved little attention so far and yet it has important applications. In this paper we propose an optimistic concurrency control algorithm devised to handle mixes of upd...
متن کاملParallel Generation of Inverted Files for Distributed Text Collections
We present a scalable algorithm for the parallel computation of inverted files for large text collections. The algorithm takes into account an environment of a high bandwidth network of workstations with a shared-nothing memory organization. The text collection is assumed to be evenly distributed among the disks of the various workstations. Compression is used to save space in main memory (wher...
متن کاملA Complete Path Representation Method with a Modified Inverted Index for Efficient Retrieval of XML Documents
Compiling documents in extensible markup language (XML) increasingly requires access to data services which provide both rapid response and the precise use of search engines. Efficient data service should be based on a skillful representation that can support low complexity and high precision search capabilities. In this paper, a novel complete path representation (CPR) associated with a modifi...
متن کاملChallenging Ubiquitous Inverted Files
Stand-alone ranking systems based on highly optimized inverted file structures are generally considered ‘the’ solution for building search engines. Observing various developments in software and hardware, we argue however that IR research faces a complex engineering problem in the quest for more flexible yet efficient retrieval systems. We propose to base the development of retrieval systems on...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Journal of the ACM
سال: 1987
ISSN: 0004-5411,1557-735X
DOI: 10.1145/28869.28873